Red Hat Auto-Remediation Workflow Generator

Automated diagnostic and remediation workflow generation for Red Hat Enterprise Linux (RHEL) systems using AI and MCP (Model Context Protocol) servers.

Powered by OpenRouter: Access Claude, GPT-4, Gemini, and other leading models through a single API.

Overview

This project provides an end-to-end pipeline that:

Diagnoses system issues from error logs using Red Hat's Security Data API and Knowledge Base
Generates executable remediation workflows with proper error handling and approvals
Produces workflow definitions compatible with the Nexus Workflow Engine

Architecture

Error Logs
    ↓
[Diagnostic Agent]
    ↓ (uses MCP servers)
    ├── Red Hat Security Data API (CVEs, advisories)
    └── Red Hat Knowledge Base (solutions, articles)
    ↓
Diagnosis (root causes + remediation steps)
    ↓
[Workflow Generator]
    ↓
Executable Workflow Definition (JSON)
    ↓
[Workflow Engine] (execution - not included)

See ARCHITECTURE.md for detailed visual diagrams of the entire process flow, MCP server architecture, agentic research loop, and data flow.

Features

✅ Agentic Research: LLM autonomously researches using MCP tools
✅ Multi-source Intelligence: Combines CVE data + KB articles
✅ Structured Output: Generates valid workflow JSON
✅ Risk Assessment: Assigns risk levels and approval requirements
✅ Retry Policies: Automatic retry configuration based on risk
✅ Checkpointing: Saves diagnosis and workflow at each stage

Project Structure

redhat-diagnostic-workflow/
├── mcp_servers/
│   ├── redhat_security_server.py  # MCP server for Security Data API
│   └── redhat_kb_server.py         # MCP server for Knowledge Base API
├── diagnostic_agent/
│   ├── diagnostic_agent.py         # Main diagnostic agent
│   ├── workflow_generator.py       # Workflow generator
│   └── pipeline.py                 # Complete orchestration pipeline
├── examples/
│   ├── nginx_openssl_error.log     # Example: nginx segfault
│   └── systemd_timeout_error.log   # Example: systemd/PostgreSQL issue
├── requirements.txt                # Python dependencies
├── .env.example                    # Environment variable template
├── test_redhat_access.py           # Red Hat API connectivity test
├── run_demo.sh                     # Demo script
├── README.md                       # This file
├── QUICKSTART.md                   # Quick start guide
├── OPENROUTER.md                   # OpenRouter setup and usage
├── TESTING.md                      # Testing guide and troubleshooting
├── ARCHITECTURE.md                 # Visual architecture and process flow
└── CHANGELOG.md                    # Project changelog

Prerequisites

Python 3.10+
OpenRouter API key (required)
- Access to Claude, GPT-4, Gemini, and more
- No waitlist, pay-as-you-go pricing
- See OPENROUTER.md for setup and model selection
(Optional) Red Hat Customer Portal credentials for authenticated KB access
Nexus Workflow Schema (if using with Nexus)

Installation

1. Clone or navigate to the project directory

cd ~/scratch/redhat-diagnostic-workflow

2. Create virtual environment

python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

3. Install dependencies

pip install -r requirements.txt

4. Set up environment variables

cp .env.example .env

Edit .env and add your API key:

# OpenRouter API Key (required)
OPENROUTER_API_KEY=sk-or-v1-...

# Optional: Red Hat credentials for KB access
REDHAT_USERNAME=your-username
REDHAT_PASSWORD=your-password

Note: Red Hat credentials are only needed for authenticated KB endpoints. The Security Data API is public.
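
As a minimal sketch of how a script might read these settings, the required key can be checked up front while the Red Hat credentials stay optional. The Settings class and load_settings helper are hypothetical names, not part of the project, and the sketch assumes the variables are already present in the process environment (for example, exported by a dotenv loader):

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables defined in .env."""
    openrouter_api_key: str
    redhat_username: Optional[str] = None
    redhat_password: Optional[str] = None

    @property
    def has_redhat_credentials(self) -> bool:
        # Both values are needed for authenticated KB endpoints.
        return bool(self.redhat_username and self.redhat_password)


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from an environment mapping, failing early if the key is missing."""
    api_key = env.get("OPENROUTER_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("OPENROUTER_API_KEY is required; add it to .env")
    return Settings(
        openrouter_api_key=api_key,
        redhat_username=env.get("REDHAT_USERNAME") or None,
        redhat_password=env.get("REDHAT_PASSWORD") or None,
    )
```

Passing the environment as a mapping keeps the helper easy to exercise with a plain dict, without touching the real environment.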

Testing Red Hat API Access

Before running the full pipeline, you can test connectivity to the Red Hat APIs. This step does not need an LLM API key:

# Activate virtual environment
source venv/bin/activate

# Run Red Hat API access tests
python test_redhat_access.py

This will test:

Red Hat Security Data API (public, no auth required)
- CVE lookups
- Security advisories
- Package vulnerability searches
Red Hat Knowledge Base API (optional auth)
- KB article search
- Solution lookups
MCP server file checks

Expected output:

========================================
RED HAT API ACCESS TEST
========================================

TEST 1: Red Hat Security Data API (Public)
========================================

Get specific CVE
   Testing CVE lookup for CVE-2024-6387 (OpenSSH vulnerability)
   URL: https://access.redhat.com/labs/securitydataapi/cve/CVE-2024-6387.json
   SUCCESS (200 OK)
      CVE ID: CVE-2024-6387
      Severity: High
      CVSS3 Score: 8.1

...

TOTAL: 5/5 tests passed
All tests passed! Red Hat API access is working.

If tests fail, check:

Internet connectivity to access.redhat.com
Firewall settings
Red Hat credentials (for KB API tests)

See TESTING.md for detailed testing guide and troubleshooting.
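
For a quick manual check of the public Security Data API, a single CVE lookup can be sketched with the standard library alone. The URL pattern is copied from the expected output above; the helper names cve_url and fetch_cve are hypothetical and are not claimed to match what test_redhat_access.py does internally:

```python
import json
import re
import urllib.request

# Base URL as shown in the expected test output above.
BASE_URL = "https://access.redhat.com/labs/securitydataapi"
CVE_ID = re.compile(r"CVE-\d{4}-\d{4,}")


def cve_url(cve_id: str) -> str:
    """Build the JSON endpoint URL for a single CVE, validating the ID format."""
    cve_id = cve_id.strip().upper()
    if not CVE_ID.fullmatch(cve_id):
        raise ValueError(f"not a CVE identifier: {cve_id!r}")
    return f"{BASE_URL}/cve/{cve_id}.json"


def fetch_cve(cve_id: str, timeout: float = 10.0) -> dict:
    """Fetch and decode the CVE record.

    urllib raises urllib.error.HTTPError on a 4xx/5xx response and
    urllib.error.URLError when the host cannot be reached.
    """
    with urllib.request.urlopen(cve_url(cve_id), timeout=timeout) as resp:
        return json.load(resp)
```

Calling fetch_cve("CVE-2024-6387") from a Python shell with the virtual environment active should return the decoded record when connectivity to access.redhat.com is working.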

Usage

Basic Usage

# Activate virtual environment
source venv/bin/activate

# Run pipeline with example error log
python diagnostic_agent/pipeline.py \
  --logs examples/nginx_openssl_error.log \
  --schema /path/to/workflow-definition.schema.json \
  --output-dir ./output

Advanced Usage

# Use custom session ID
python diagnostic_agent/pipeline.py \
  --logs examples/systemd_timeout_error.log \
  --schema /path/to/workflow-definition.schema.json \
  --session-id "incident-2025-12-03-001" \
  --output-dir ./output

# Pass error message directly (not from file)
python diagnostic_agent/pipeline.py \
  --logs "nginx segfault in libssl.so" \
  --schema /path/to/workflow-definition.schema.json
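
The --logs option accepts either a file path or a literal message. How pipeline.py tells the two apart is not documented here; one plausible approach, sketched with a hypothetical resolve_logs helper, is to read the argument as a file when it names an existing file and otherwise treat it as the log text itself:

```python
from pathlib import Path


def resolve_logs(value: str) -> str:
    """Return log text from a file path, or the value itself if no such file exists."""
    try:
        path = Path(value)
        if path.is_file():
            # Tolerate stray bytes in logs rather than failing on decode errors.
            return path.read_text(encoding="utf-8", errors="replace")
    except (OSError, ValueError):
        # Long or unusual messages may be invalid as paths; treat them as text.
        pass
    return value
```

With this approach a typo in a file name is silently treated as a log message, so a real implementation might want to warn when the value looks like a path but does not exist.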

Output Structure

After running, the pipeline creates:

output/
└── incident-20251203-142345/
    ├── error_logs.txt         # Original error logs
    ├── diagnosis.json         # Diagnostic results
    ├── workflow.json          # Generated workflow
    └── summary.json           # Complete session summary
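
The layout above can be reproduced with a few lines of standard-library code. This is a sketch only: new_session_id and save_checkpoint are hypothetical helpers, and the default ID format is inferred from the example directory name rather than taken from pipeline.py.

```python
import json
from datetime import datetime
from pathlib import Path
from typing import Optional


def new_session_id(now: Optional[datetime] = None) -> str:
    """Build an ID shaped like the example directory name above."""
    now = now or datetime.now()
    return now.strftime("incident-%Y%m%d-%H%M%S")


def save_checkpoint(output_dir: str, session_id: str, name: str, data) -> Path:
    """Write one stage's result into the session directory and return its path."""
    session_dir = Path(output_dir) / session_id
    session_dir.mkdir(parents=True, exist_ok=True)
    target = session_dir / name
    if isinstance(data, str):
        # Plain text, e.g. error_logs.txt
        target.write_text(data, encoding="utf-8")
    else:
        # Anything else is serialized as JSON, e.g. diagnosis.json
        target.write_text(json.dumps(data, indent=2), encoding="utf-8")
    return target
```

Writing each stage as soon as it completes means a failed workflow generation still leaves error_logs.txt and diagnosis.json on disk for inspection.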

MCP Servers

Red Hat Security Server

Provides access to Red Hat Security Data API:

Tools:

search_cve: Search CVEs by ID or package name
get_rhsa: Get security advisory details
search_affected_packages: Find affected packages for a CVE
get_errata: Get errata information

Example standalone usage:

python mcp_servers/redhat_security_server.py

Red Hat Knowledge Base Server

Provides access to Red Hat Customer Portal KB:

Tools:

search_kb: Search KB articles
get_kb_article: Get full article by ID
search_solutions: Search for error message solutions
search_by_symptom: Search by symptom description

Example standalone usage:

export REDHAT_USERNAME=your-username
export REDHAT_PASSWORD=your-password
python mcp_servers/redhat_kb_server.py

Example Workflows

See examples/README.md for complete documentation of all 9 example scenarios.

Example 1: Nginx OpenSSL Vulnerability

Input (examples/nginx_openssl_error.log):

ERROR nginx: worker process exited on signal 11 (core dumped)
ERROR kernel: nginx[1234]: segfault in libssl.so.1.1

Diagnosis:

Root cause: Vulnerable OpenSSL 1.1.1k (CVE-XXXX-YYYY)
Severity: High
Evidence: Segfault in libssl + CVE match

Generated Workflow:

Backup nginx configuration (script, low risk)
Stop nginx service (script, high risk, requires approval)
Upgrade OpenSSL (ansible, high risk, requires approval)
Restart nginx (script, medium risk)
Verify health (API call, low risk)

Example 2: PostgreSQL SELinux Denial

Input (examples/systemd_timeout_error.log):

ERROR systemd: postgresql.service: Start operation timed out
ERROR postgresql: could not open file: Permission denied
ERROR selinux: AVC denial: denied read access

Diagnosis:

Root cause: SELinux context mismatch on PostgreSQL data directory
Severity: Medium
Evidence: Permission denied + AVC denial

Generated Workflow:

Check current SELinux context (script, low risk)
Restore correct SELinux context (script, medium risk, requires approval)
Restart PostgreSQL (script, medium risk)
Verify database accessibility (API call, low risk)

Workflow Schema Compatibility

The generated workflows match the Nexus Workflow Engine schema:

{
  "schemaVersion": "1.0.0",
  "version": 1,
  "metadata": {
    "name": "auto-remediation-20251203-142345",
    "description": "Fix nginx segfault due to CVE-XXXX-YYYY",
    "tags": ["auto-remediation", "redhat", "CVE-XXXX-YYYY"]
  },
  "triggers": [{"type": "manual", "requiresApproval": true}],
  "workflow": {
    "activities": [...]
  }
}
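
As an illustration of how such an envelope could be assembled, the sketch below wraps a list of activities in the top-level fields shown above. The function names are hypothetical, the activity format is left opaque because it is not shown here, and treating all five top-level keys as required is an assumption based on the example rather than on the schema file.

```python
from datetime import datetime
from typing import Optional

# Top-level keys that appear in the example above (assumed to be required).
REQUIRED_TOP_LEVEL = ("schemaVersion", "version", "metadata", "triggers", "workflow")


def build_workflow(description: str, activities: list, tags=(),
                   now: Optional[datetime] = None) -> dict:
    """Assemble the top-level envelope around a list of activities."""
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    return {
        "schemaVersion": "1.0.0",
        "version": 1,
        "metadata": {
            "name": f"auto-remediation-{stamp}",
            "description": description,
            "tags": ["auto-remediation", "redhat", *tags],
        },
        "triggers": [{"type": "manual", "requiresApproval": True}],
        "workflow": {"activities": list(activities)},
    }


def missing_top_level_keys(workflow: dict) -> list:
    """Cheap pre-check before full schema validation; not a substitute for it."""
    return [key for key in REQUIRED_TOP_LEVEL if key not in workflow]
```

A pre-check like missing_top_level_keys only catches gross structural problems; validating against workflow-definition.schema.json remains the authoritative test.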

Configuration

Retry Policies

Automatically configured based on risk level:

High risk: 1 attempt, fixed backoff
Medium risk: 2 attempts, exponential backoff
Low risk: 3 attempts, exponential backoff
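
The mapping above can be expressed as a small lookup table. This is a sketch with hypothetical names (RetryPolicy, retry_policy_for), not the generator's actual code; falling back to the most cautious policy for an unknown risk level is an assumption made here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetryPolicy:
    max_attempts: int
    backoff: str  # "fixed" or "exponential"


# Values taken from the list above.
RETRY_BY_RISK = {
    "high": RetryPolicy(max_attempts=1, backoff="fixed"),
    "medium": RetryPolicy(max_attempts=2, backoff="exponential"),
    "low": RetryPolicy(max_attempts=3, backoff="exponential"),
}


def retry_policy_for(risk: str) -> RetryPolicy:
    """Look up the policy for a risk level; unknown levels get the high-risk policy."""
    return RETRY_BY_RISK.get(risk.strip().lower(), RETRY_BY_RISK["high"])
```

The riskier the step, the fewer automatic retries it gets, so a high-risk operation that fails is surfaced to an operator instead of being repeated.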

Approval Requirements

Activities requiring approval:

All high-risk operations
Service restarts
Package upgrades
Manual intervention steps

Approval timeout: 10 minutes (configurable)
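
These rules amount to a simple predicate over an activity's risk level and kind. In the sketch below, the kind labels ("service_restart", "package_upgrade", "manual") are hypothetical stand-ins for the categories listed above, and requires_approval is not the generator's actual function.

```python
APPROVAL_TIMEOUT_SECONDS = 10 * 60  # default of 10 minutes, per the text above

# Hypothetical activity kinds standing in for the categories listed above.
ALWAYS_APPROVE_KINDS = {"service_restart", "package_upgrade", "manual"}


def requires_approval(risk: str, kind: str) -> bool:
    """Return True for high-risk activities and for kinds that always need sign-off."""
    return risk.strip().lower() == "high" or kind in ALWAYS_APPROVE_KINDS
```

For example, requires_approval("medium", "service_restart") is true under these rules, while a low-risk backup script would run without waiting for sign-off.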

Troubleshooting

Issue: "Authentication required" for KB search

Solution: Set Red Hat credentials in .env:

REDHAT_USERNAME=your-username
REDHAT_PASSWORD=your-password

Issue: "CVE not found in Red Hat database"

Cause: The CVE may not affect Red Hat products, or Red Hat may not have analyzed it yet.

Solution: The agent will fall back to KB article search.

Issue: MCP server connection fails

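Solution: Ensure the Python interpreter and the server script path are correct in diagnostic_agent.py:

import sys
from mcp import StdioServerParameters

server_params = StdioServerParameters(
    command=sys.executable,  # the active interpreter; or "python" / "python3"
    args=["path/to/server.py"],
)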

Issue: Workflow validation fails

Cause: Generated workflow doesn't match schema.

Solution: Check workflow-definition.schema.json path and ensure it's the correct version.

API Rate Limits

Red Hat Security Data API

Public: No authentication required
Rate limit: Reasonable use (no official limit documented)

Red Hat Customer Portal API

Authentication: Required for some KB endpoints
Rate limit: Not publicly documented

Development

Running Tests (Coming Soon)

pytest tests/

Adding New MCP Tools

Edit mcp_servers/redhat_security_server.py or mcp_servers/redhat_kb_server.py
Add new tool to @app.list_tools()
Implement handler in @app.call_tool()
Update agent prompt in diagnostic_agent.py

Limitations

Execution: Workflow execution not implemented (generates definitions only)
Ansible playbooks: Discovery works, but actual playbooks not included
System access: Cannot directly query the failing system
Context limits: Very large log files may need pre-processing

Future Enhancements

Integration with Ansible Galaxy for playbook discovery
Real-time log streaming support
Multi-server diagnostics (cluster-wide issues)
Workflow execution engine integration
Automated rollback on failure
Metrics and observability

License

MIT License (or your preferred license)

Support

For issues or questions:

Check the troubleshooting section above
Review example error logs in examples/
Consult Red Hat API documentation

Built with: OpenRouter, MCP (Model Context Protocol), Red Hat APIs
